fix double-decode and double-encode in byte stuffing codec#6
fix double-decode and double-encode in byte stuffing codec#6jktjkt wants to merge 1 commit intogertvdijk:developfrom
Conversation
Calling the replacement function multiple times is wrong because it looks at partially decoded data which are decoded again. We could fix that by using a different order for encode and decode, but I like this implementation much more because it will also flag unrecognized or truncated stuffing sequences.
|
Thanks! I like how it fixes an actual bug and catches invalid cases with the exception raised. Also your test cases using What I don't really like is the while-loop that's seem not really Pythonic. I think this is a great use case for a generator function which can be extracted from the I've given it a quick refactor and not much is left as-is with these edits, so I'm leaning towards adopting this approach "my way" and proposing that as a PR here for you to review. However, it remains your idea and I'd like it to be reflected in the commit history. Would you prefer to be the author of the commit or a co-author? Alternatively, if you would like to adopt the idea refactor below in the expandable section and amend your commit with it, that would be great too! while at it, perhaps a line on what 'stuffing' is (escaping basically) could be added too. Something like this:a class variable with the stuffable bytes to avoid duplication: _stuffable_bytes: ClassVar[frozenset[int]] = frozenset({
constants.ByteCode.STUFFING.value,
constants.ByteCode.ACK.value,
constants.ByteCode.START_FROM_METER.value,
constants.ByteCode.START_TO_METER.value,
constants.ByteCode.STOP.value,
}) @classmethod
def _iter_destuffed_bytes(cls, stuffed: bytes) -> Iterator[int]:
"""Yield destuffed byte values from a stuffed byte sequence."""
in_stuffing = False
for raw_byte in stuffed:
if in_stuffing:
in_stuffing = False
xored = raw_byte ^ 0xFF
if xored not in cls._stuffable_bytes:
raise InvalidStuffingByteError(raw_byte=raw_byte)
yield xored
continue
if raw_byte == constants.ByteCode.STUFFING.value:
in_stuffing = True
continue
yield raw_byte
if in_stuffing:
raise TruncatedStuffingError
@classmethod
def _iter_stuffed_bytes(cls, raw: bytes) -> Iterator[bytes]:
"""Yield stuffed byte chunks from an unescaped byte sequence."""
stuffing_prefix = constants.ByteCode.STUFFING.value.to_bytes(1, "big")
for raw_byte in raw:
if raw_byte in cls._stuffable_bytes:
yield stuffing_prefix + (raw_byte ^ 0xFF).to_bytes(1, "big")
else:
yield raw_byte.to_bytes(1, "big")
[...]then decoding would be data_bytes = frame[1:-1]
return cast(DataLinkBytes, bytes(self._iter_destuffed_bytes(data_bytes)))and encoding stuffed = b"".join(self._iter_stuffed_bytes(raw))
frame = (
self._start_byte.to_bytes(1, "big")
+ stuffed
+ constants.ByteCode.STOP.value.to_bytes(1, "big")
)
return cast(PhysicalBytes, frame)Let me know what you think. |
|
I agree that a generator is a nice improvement, thanks. A I'll read the updated PR and test this tomorrow (we have some heat meters which triggered the double decoding during an actual log readout). Let's close this one in favor of #7. |
Calling the replacement function multiple times is wrong because it looks at partially decoded data which are decoded again. We could fix that by using a different order for encode and decode, but I like this implementation much more because it will also flag unrecognized or truncated stuffing sequences.